CHAPTER 6 Taking All Kinds of Samples 81

This process ensures that your sample of 20 patients was taken completely at random. Statistical packages like those described in Chapter 4 have RNG commands similar to the one in Excel.

Learners sometimes think that as long as they sort a spreadsheet of data by a column containing any value and then select a sample of rows from the top, they have automatically obtained an SRS. This is not correct! If you think about it more carefully, you will realize why. If you sort names alphabetically, you will see patterns in names (such as religious names, or names associated with certain languages, countries, or ethnicities). If you sort by another identifying column, such as email address or city of residence, you will again see patterns in the data. If you attempt to take an SRS from data sorted this way, the sample will be biased rather than random, and it will not be representative. That is why it is important to sort by a column of RNG-generated values if you are taking an SRS electronically.
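The sort-by-random-number approach can also be sketched outside a spreadsheet. The following Python example is a minimal illustration, not the book's own procedure: the function name `simple_random_sample`, the list of 200 patient IDs, and the seed value are all assumptions made up for the example.

```python
import random

def simple_random_sample(rows, n, seed=None):
    """Draw an SRS of n rows by sorting on a random key,
    mirroring the "RNG column" approach used in a spreadsheet."""
    rng = random.Random(seed)
    # Attach an independent random number to each row (the RNG column).
    keyed = [(rng.random(), row) for row in rows]
    # Sort on the random number only, never on a data column.
    keyed.sort(key=lambda pair: pair[0])
    # Keep the first n rows from the top of the shuffled list.
    return [row for _, row in keyed[:n]]

# Hypothetical list of 200 patient IDs, for illustration only.
patients = [f"patient_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(patients, 20, seed=42)
print(sample)
```

Because the sort key is unrelated to names, email addresses, or cities, any patterns in those columns cannot influence which rows end up at the top. Fixing the seed is optional; it only makes the draw reproducible.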

Taking an SRS intuitively seems like the optimal way to draw a representative sample. However, there are caveats. In the previous example, you started with a clinical population in the form of a printed or electronic list of patients from which you could draw a sample. But what if you want to sample from patients presenting to the emergency department during a particular period of time in the future? Such a list does not exist. In a situation like that, you could use systematic sampling, which is explained later in the section “Engaging in systematic sampling.”

Another caveat of SRS is that it can miss important subgroups. Imagine that in your list of clinic patients, only 10 percent are pediatric patients (defined as patients under the age of 18 years). Because 10 percent of 20 is two, you may expect that a random sample of 20 patients from this population would include two pediatric patients. But in practice, it would not be unusual for an SRS of 20 patients to include zero pediatric patients. If your sample needs to ensure representation of certain subgroups, then you should consider using stratified sampling instead.
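To see how often an SRS of 20 could miss the pediatric subgroup entirely, you can run a rough calculation. The sketch below assumes the clinic list is large compared with the sample, so that each selected patient can be treated as having an independent 10 percent chance of being pediatric (a binomial approximation). For a short list, a hypergeometric calculation would be more exact. The function name `prob_k_pediatric` is made up for the example.

```python
from math import comb

n = 20              # sample size from the example
p_pediatric = 0.10  # share of pediatric patients in the population

def prob_k_pediatric(k, n=n, p=p_pediatric):
    """Binomial probability of exactly k pediatric patients in a sample of n.
    Assumes the population is large relative to the sample."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_zero = prob_k_pediatric(0)
print(f"P(no pediatric patients) = {p_zero:.3f}")  # prints 0.122
```

Under this assumption, roughly one SRS in eight would contain no pediatric patients at all.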

Taking a stratified sample

In the previous section, we discussed a scenario where 10 percent of the patients of a clinic are pediatric patients, and taking a sample of 20 using an SRS from a list of the clinic population runs the risk of not including any pediatric patients. If pediatric patients are important to the study, then this problem can be solved with stratified sampling. The word stratum refers to a layer (as you see in a layer cake), and the word strata is the plural of stratum. Stratified sampling can be seen as sampling from strata, or layers.
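The idea of sampling from layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clinic list of 200 patients is hypothetical, the helper names `stratified_sample` and `age_group` are made up for the example, and the allocation of 2 pediatric and 18 adult patients simply mirrors the 10 percent share in the scenario rather than any rule stated in the text.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Split the population into strata, then draw a separate
    random sample of the requested size from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name, n in n_per_stratum.items():
        # random.Random.sample draws n units without replacement.
        sample.extend(rng.sample(strata[name], n))
    return sample

def age_group(patient):
    """Pediatric is defined as under 18 years of age."""
    return "pediatric" if patient["age"] < 18 else "adult"

# Hypothetical clinic list: 200 patients, 10 percent pediatric.
patients = [{"id": i, "age": 10 if i <= 20 else 40} for i in range(1, 201)]

# Assumed allocation: 2 pediatric and 18 adult patients (20 in total).
chosen = stratified_sample(patients, age_group,
                           {"pediatric": 2, "adult": 18}, seed=1)
print(sum(age_group(p) == "pediatric" for p in chosen))  # prints 2
```

Because each stratum is sampled separately, the pediatric layer is guaranteed to contribute its two patients, which an SRS of the whole list cannot promise.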